Semi Supervised Learning of Online Data Streams with Max Flow Algorithm

Author

  • Vina M. Lomte
Abstract

This paper proposes a semi-supervised learning system that handles large, dynamic data in a time- and memory-efficient manner. Many information-system applications today deal with large amounts of data in changing environments, where the data arrives as online streams. Because of the volume and speed of the data, memory and time limits are the main constraints in online learning. In online streams, only a small portion of the data is labelled, while a large quantity of unlabelled data is available. Semi-supervised learning is therefore well suited to such data: it reduces human labelling effort while still achieving good accuracy and system performance. This paper examines the concept of semi-supervised learning and uses a max flow algorithm to learn from the data stream. The Max Flow/Min Cut algorithm is applied to achieve the accuracy of the proposed system, and it shows improved efficiency and performance. To enhance performance further, cosine similarity and feature selection techniques are used. A comparative study shows that the proposed system performs better in terms of time and memory efficiency than the earlier learning system. The system is tested on the KDD99 dataset for classification of network intrusion attacks, where it shows higher accuracy and improved performance. Application: The proposed system offers solutions to various two-class problems. Examples include network intrusion detection, where network traffic is classified as anomalous or normal; online background noise reduction in audio streams; traffic bottleneck identification; image segmentation, where an image is split into background and foreground; 3D reconstruction; and many more.
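To make the idea in the abstract concrete, the following is a minimal sketch of two-class semi-supervised labelling by minimum cut, with cosine similarity as the edge weight. It is not the authors' implementation: the function names `cosine_similarity` and `min_cut_labels`, the `threshold` parameter, the dense capacity matrix, and the choice of Edmonds-Karp as the max flow routine are all assumptions made for illustration. Labelled points are tied to a source (class 1) or a sink (class 0) with a very large capacity, and unlabelled points take the class of whichever side of the minimum cut they end up on.

```python
from collections import deque
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def min_cut_labels(points, labels, threshold=0.5):
    """Label every point 0 or 1 using a source/sink minimum cut.

    points:    list of feature vectors
    labels:    dict {index: 0 or 1} for the few labelled points
    threshold: hypothetical cutoff; pairs less similar than this get no edge
    """
    n = len(points)
    source, sink = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]

    # Undirected similarity edges between data points.
    for i in range(n):
        for j in range(i + 1, n):
            w = cosine_similarity(points[i], points[j])
            if w >= threshold:
                cap[i][j] = cap[j][i] = w

    # Tie labelled points to source/sink with an effectively uncuttable edge.
    big = float(n * n + 1)
    for i, y in labels.items():
        if y == 1:
            cap[source][i] = big
        else:
            cap[i][sink] = big

    # Edmonds-Karp: repeatedly augment along shortest residual paths (BFS).
    while True:
        parent = [-1] * (n + 2)
        parent[source] = source
        queue = deque([source])
        while queue and parent[sink] == -1:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[sink] == -1:
            break  # no augmenting path left: the flow is maximal

        # Find the bottleneck capacity on the path, then push that much flow.
        flow, v = float("inf"), sink
        while v != source:
            flow = min(flow, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while v != source:
            u = parent[v]
            cap[u][v] -= flow
            cap[v][u] += flow
            v = u

    # Points still reachable from the source lie on the class-1 side of the
    # cut; everything else (including isolated points) defaults to class 0.
    return [1 if parent[i] != -1 else 0 for i in range(n)]


if __name__ == "__main__":
    points = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2],
              [0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
    # Only points 0 and 3 are labelled; the other four are unlabelled.
    print(min_cut_labels(points, {0: 1, 3: 0}, threshold=0.3))
```

A dense matrix keeps the sketch short but costs quadratic memory, so a streaming setting would presumably apply it per window or chunk of the stream and use a sparse graph; that design is likewise an assumption and not something the abstract specifies.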


Similar resources

Detecting Concept Drift in Data Stream Using Semi-Supervised Classification

A data stream is a sequence of data generated from various information sources at high speed and in high volume. Classifying data streams faces three challenges: unlimited length, online processing, and concept drift. In related research, the challenge of unlimited stream length is commonly met by dividing the stream into fixed-size windows or by gradual forgetting. Concept drift refer...


Learning from Concept Drifting Data Streams with Unlabeled Data

Contrary to the previous beliefs that all arrived streaming data are labeled and the class labels are immediately available, we propose a Semi-supervised classification algorithm for data streams with concept drifts and UNlabeled data, called SUN. SUN is based on an evolved decision tree. In terms of deviation between history concept clusters and new ones generated by a developed clustering alg...


Concurrent Semi-supervised Learning of Data Streams

Conventional stream mining algorithms focus on single and stand-alone mining tasks. Given the single-pass nature of data streams, it makes sense to maximize throughput by performing multiple complementary mining tasks concurrently. We investigate the potential of concurrent semi-supervised learning on data streams and propose an incremental algorithm called CSL-Stream (Concurrent Semi–supervise...


A Graph-based Semi-Supervised Learning Approach and its Feature Selection

When labeled data is scarce, the learning accuracy of a supervised learning algorithm deteriorates. Meanwhile, it is easier to collect plenty of unlabeled data. Furthermore, a graph can be used to express the underlying distribution of data in the dataset. Thus, a classification problem is converted to a graph partition problem. One typical graph-based semi-supervised learning algorithm is...


Detecting and Tracking Concept Class Drift and Emergence in Non-Stationary Fast Data Streams

As the proliferation of constant data feeds increases from social media, embedded sensors, and other sources, the capability to provide predictive concept labels to these data streams will become ever more important and lucrative. However, the dynamic, nonstationary nature, and effectively infinite length of data streams pose additional challenges for stream data mining algorithms. The sparse q...




Publication date: 2017